8.2 Molecular Simulation Methods
Monte Carlo techniques rely on repeated random sampling to determine numerically whether or not specific events occur in a given biological system. In the most common biological applications, Monte Carlo methods are used to enable stochastic simulations. For example, a Monte Carlo algorithm will consider the likelihood that a given biological event occurs in the time interval t to t + Δt, where Δt is a small but finite discretized incremental time unit (e.g., for combined classical MD and Monte Carlo methods, Δt ~ 10⁻¹⁵ s). The probability Δp for that event to occur is calculated from biophysical parameters specific to the biological system in question, and this is compared against a probability prand generated from a pseudorandom number generator in the range 0–1. If Δp > prand, then one assumes the event has occurred, and the algorithm changes the properties of the biological system to take account of this event. The new time in the stochastic simulation then becomes t′ = t + Δt, the polling process is repeated over the interval t′ to t′ + Δt, and the procedure is iterated over the full range of time to be explored.
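By way of illustration, the polling scheme just described can be sketched in a few lines of Python; the event rate k_event, the time step dt, and the total simulated time used here are purely hypothetical values, chosen so that Δp remains well below 1:

```python
import random

# Minimal sketch of stochastic event polling (illustrative parameter values only).
# k_event is a hypothetical first-order rate constant for the biological event (s^-1);
# dp = k_event * dt approximates the probability that the event occurs in t to t + dt.
k_event = 50.0      # hypothetical event rate (s^-1)
dt = 1e-3           # time step (s), chosen so that dp << 1
t_total = 1.0       # total simulated time (s)

t = 0.0
n_events = 0
while t < t_total:
    dp = k_event * dt              # probability of the event in this interval
    p_rand = random.random()       # pseudorandom number in the range 0-1
    if dp > p_rand:                # Monte Carlo acceptance test
        n_events += 1              # update the system to reflect the event
    t += dt                        # advance to t' = t + dt and poll again

print(f"{n_events} events in {t_total} s (expected ~{k_event * t_total:.0f})")
```

In a real application, the line incrementing n_events would instead update whatever properties of the biological system the event changes.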
In molecular simulation applications, the trial probability Δp is associated with trial moves of an atom’s position. Classical MD, in the absence of any stochastic fluctuations in the simulation, is intrinsically deterministic. Monte Carlo methods, on the other hand, are stochastic in nature, and so atomic positions evolve with a random element over time (Figure 8.2). Pseudorandom small displacements are made in the positions of atoms, one by one, and the resultant change ΔU in potential energy is calculated using empirical potential energy functions similar to those described in the previous section for classical MD (most commonly the Lennard-Jones potential). Whether the random atomic displacement is accepted or not is usually determined using the Metropolis criterion. Here, the trial probability Δp is established using the Boltzmann factor exp(−ΔU/kBT), and if Δp > prand, the pseudorandom probability, then the atomic displacement is accepted, the potential energy is updated, and the process is iterated for another atom. Since Monte Carlo methods require only potential energy calculations to compute atomic trajectories, as opposed to additional force calculations, there is a saving in computational efficiency.
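The Python sketch below illustrates this Metropolis scheme for a toy cluster of atoms interacting through the Lennard-Jones potential in reduced units (ε = σ = kB = 1); the particle number, temperature, maximum trial displacement, and starting lattice are illustrative assumptions rather than values from the text:

```python
import math
import random

# Minimal Metropolis Monte Carlo sketch in Lennard-Jones reduced units
# (epsilon = sigma = kB = 1). Particle count, temperature, maximum
# displacement and starting lattice are illustrative assumptions.

def lj_pair(r2):
    """Lennard-Jones pair energy as a function of squared separation."""
    inv_r6 = 1.0 / r2 ** 3
    return 4.0 * (inv_r6 ** 2 - inv_r6)

def particle_energy(i, pos):
    """Potential energy of atom i due to all other atoms."""
    xi, yi, zi = pos[i]
    u = 0.0
    for j, (xj, yj, zj) in enumerate(pos):
        if j != i:
            u += lj_pair((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2)
    return u

random.seed(0)
T, max_disp, n_steps = 1.0, 0.1, 5000
# start from a small 2 x 2 x 2 cube with near-optimal LJ spacing
pos = [[1.12 * (k & 1), 1.12 * ((k >> 1) & 1), 1.12 * ((k >> 2) & 1)] for k in range(8)]

accepted = 0
for step in range(n_steps):
    i = random.randrange(len(pos))                     # move one atom at a time
    old = pos[i][:]
    u_old = particle_energy(i, pos)
    pos[i] = [x + random.uniform(-max_disp, max_disp) for x in old]  # trial move
    dU = particle_energy(i, pos) - u_old
    # Metropolis criterion: accept if exp(-dU/kBT) exceeds a pseudorandom number
    if dU <= 0.0 or math.exp(-dU / T) > random.random():
        accepted += 1
    else:
        pos[i] = old                                   # reject: restore old position

print(f"acceptance ratio = {accepted / n_steps:.2f}")
```

Note that only the potential energy of the moved atom needs to be recomputed at each step; no forces are evaluated, which is the source of the computational saving mentioned above.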
This simple process can lead to very complex outputs if, as is generally the case, the biological system is also spatially discretized; that is, probabilistic polling is performed on spatially separated components of the system. This can capture several emergent properties. Depending on the specific biophysics of the system under study, the incremental time Δt can vary, but it has to be small enough to ensure subsaturation levels of Δp, that is, Δp < 1 (in practice, Δt is often tailored so that typical values of Δp lie in the range ~0.3–0.5). However, very small values of Δt lead to excessive computational times. Similarly, extracting the most biological insight ideally requires the finest spatial discretization of the system. These two factors combined can demand significant parallelization of the computation and, when applied to molecular simulations, often necessitate supercomputing, or high-performance computing (HPC), resources. However, there are still several biological processes beyond molecular simulations that can be usefully simulated using nothing more than a few lines of code in any standard engineering software language (e.g., C/C++, MATLAB®, Fortran, Python, LabVIEW) on a standard desktop PC.
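As one hedged example of such a "few lines of code" simulation on a spatially discretized system, the Python sketch below polls each member of a row of hypothetical two-state ion channels independently at every time step; the rate constants, channel number, and time step are invented purely for illustration:

```python
import random

# Probabilistic polling on a spatially discretized system: a 1D row of
# membrane ion channels that each switch stochastically between closed (0)
# and open (1). All parameter values are illustrative assumptions.
k_open, k_close = 20.0, 80.0      # opening/closing rates (s^-1)
dt = 1e-3                         # time step (s); keeps dp well below 1
n_channels, n_steps = 100, 2000

state = [0] * n_channels          # all channels start closed
for step in range(n_steps):
    for i in range(n_channels):   # poll each spatial component separately
        dp = (k_close if state[i] else k_open) * dt
        if dp > random.random():  # Monte Carlo event test
            state[i] ^= 1         # flip between closed and open
    # the open fraction fluctuates around k_open/(k_open + k_close) = 0.2

print(f"open fraction at end = {sum(state) / n_channels:.2f}")
```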
A practical issue with the Metropolis criterion is that some microstates during the finite duration of a simulation may be very undersampled or in fact not occupied at all. This is a particular danger for a system that comprises two or more intermediate states of stability for molecular conformations that are separated by relatively large free energy barriers. This implies that the standard Boltzmann factor exp(−ΔU/kBT) is in practice very small when ΔU corresponds to these free energy barriers. This can result in a simulation being apparently locked into, for example, one heavily oversampled state. Given enough time, of course, a simulation would explore all microstates, as predicted from the ergodic hypothesis (see Chapter 6); however, practical computational limitations may not always permit that.
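A short numerical illustration (assuming, purely for the sake of example, barrier heights of 1–20 kBT) makes the scale of the problem clear:

```python
import math

# Illustrative only: acceptance probability exp(-dU/kBT) for trial moves
# that must surmount a free energy barrier of dU (in units of kBT).
for barrier_kT in (1, 5, 10, 20):
    print(f"dU = {barrier_kT:>2} kBT  ->  exp(-dU/kBT) = {math.exp(-barrier_kT):.1e}")
# A 20 kBT barrier gives ~2e-9, so barrier-crossing moves are essentially
# never accepted within a practical number of Monte Carlo steps.
```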
To overcome this problem, a method called “umbrella sampling” can be employed. Here a
weighting factor w is used in front of the standard Boltzmann term: